algorithm 1
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Saudi Arabia (0.04)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- North America > Canada (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- North America > United States > Pennsylvania (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
When to Trust the Cheap Check: Weak and Strong Verification for Reasoning
Kiyani, Shayan, Noorani, Sima, Pappas, George, Hassani, Hamed
Reasoning with LLMs increasingly unfolds inside a broader verification loop. Internally, systems use cheap checks, such as self-consistency or proxy rewards, which we call weak verification. Externally, users inspect outputs and steer the model through feedback until results are trustworthy, which we call strong verification. These signals differ sharply in cost and reliability: strong verification can establish trust but is resource-intensive, while weak verification is fast and scalable but noisy and imperfect. We formalize this tension through weak-strong verification policies, which decide when to accept or reject based on weak verification and when to defer to strong verification. We introduce metrics capturing incorrect acceptance, incorrect rejection, and strong-verification frequency. At the population level, we show that optimal policies admit a two-threshold structure and that calibration and sharpness govern the value of weak verifiers. Building on this, we develop an online algorithm that provably controls acceptance and rejection errors without assumptions on the query stream, the language model, or the weak verifier.
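The two-threshold structure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a weak-verifier score in [0, 1] where higher means "more likely correct", and the names `two_threshold_policy`, `evaluate`, `t_low`, and `t_high` are hypothetical. The three rates are computed here as fractions of all queries, which may differ from the paper's exact definitions.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    DEFER = "defer"  # hand the query to strong verification


def two_threshold_policy(weak_score: float, t_low: float, t_high: float) -> Decision:
    """Accept on high weak scores, reject on low ones, defer in between.

    Assumption: weak_score lies in [0, 1] and larger means more likely correct.
    """
    if t_low > t_high:
        raise ValueError("t_low must not exceed t_high")
    if weak_score >= t_high:
        return Decision.ACCEPT
    if weak_score <= t_low:
        return Decision.REJECT
    return Decision.DEFER


def evaluate(scores, labels, t_low, t_high):
    """Return (incorrect-acceptance, incorrect-rejection, strong-verification) rates.

    labels[i] is True when response i is actually correct. Each rate is a
    fraction of all queries (an assumed convention for this sketch).
    """
    n = len(scores)
    bad_accept = bad_reject = deferred = 0
    for score, correct in zip(scores, labels):
        decision = two_threshold_policy(score, t_low, t_high)
        if decision is Decision.DEFER:
            deferred += 1
        elif decision is Decision.ACCEPT and not correct:
            bad_accept += 1
        elif decision is Decision.REJECT and correct:
            bad_reject += 1
    return bad_accept / n, bad_reject / n, deferred / n


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.5, 0.95, 0.05]
    labels = [True, False, True, False, True]
    print(evaluate(scores, labels, t_low=0.2, t_high=0.8))
```

Widening the gap between `t_low` and `t_high` sends more queries to strong verification and leaves fewer decisions to the noisy weak score, which is the cost-reliability trade-off the abstract describes.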
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
Fast Bellman Updates for Wasserstein Distributionally Robust MDPs
Markov decision processes (MDPs) are often sensitive to model ambiguity. In recent years, robust MDPs have emerged as an effective framework for addressing this challenge. Distributionally robust MDPs extend this framework by incorporating distributional information about the uncertain model parameters, which alleviates the conservatism of robust MDPs.
- Asia > China > Hong Kong (0.04)
- North America > United States > Massachusetts (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > Washington > King County > Bellevue (0.04)
- North America > United States > California > Los Angeles County > Santa Monica (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Monaco (0.04)
- Europe > Italy > Calabria (0.04)
- Oceania > Australia (0.04)
- North America > United States (0.04)
- Asia > Middle East > Jordan (0.04)